Guess Who Rated This Movie: Identifying Users Through Subspace Clustering

Authors

  • Amy Zhang
  • Nadia Fawaz
  • Stratis Ioannidis
  • Andrea Montanari
Abstract

It is often the case that, within an online recommender system, multiple users share a common account. Can such shared accounts be identified solely on the basis of the user-provided ratings? Once a shared account is identified, can the different users sharing it be identified as well? Whenever such user identification is feasible, it opens the way to possible improvements in personalized recommendations, but also raises privacy concerns. We develop a model for composite accounts based on unions of linear subspaces, and use subspace clustering to carry out the identification task. We show that a significant fraction of such accounts can be identified reliably, and illustrate potential uses for personalized recommendation.

Similar resources

A Two-Phase Spectral Bigraph Co-clustering Approach for the “Who Rated What” Task in KDD Cup 2007

This paper describes our approach for the “Who Rated What” task in the KDD Cup 2007 competition. Given the Netflix data set, which consists of more than 100 million ratings made between 1998 and 2005, the task is to predict the probability that each user-movie pair was rated in 2006. In total, 100,000 user-movie pairs are drawn from the Netflix data set as the test set. In our approach, the Netflix data se...

Design of Movie Recommendation System by Means of Collaborative Filtering

The purpose of this study is to develop a ‘Movie Recommendation System’ using a Collaborative Filtering approach. For this purpose we have used the Netflix prize dataset which is available for download from www.netflix.com. This dataset contains a huge number of files. A total of 480189 users have rated the movies. For clustering the users in the dataset according to their respe...

How To Break Anonymity of the Netflix Prize Dataset

As part of the Netflix Prize contest, Netflix recently released a dataset containing movie ratings of a significant fraction of their subscribers. The dataset is intended to be anonymous, and all customer identifying information has been removed. We demonstrate that an attacker who knows only a little bit about an individual subscriber can easily identify this subscriber’s record if it is prese...

Scuba Diver: Subspace Clustering of Web Search Results

Current search engines present their search results as a ranked list of Web pages. However, as the number of pages on the Web increases exponentially, so does the number of search results for any given query. We present a novel subspace clustering based algorithm to organize keyword search results by simultaneously clustering and identifying distinguishing terms for each cluster. Our system, na...

Increasing the Accuracy of Recommender Systems Using the Combination of K-Means and Differential Evolution Algorithms

Recommender systems try to make recommendations to each user based on performance, personal tastes, user behaviors, and context that match their personal preferences, and help them in the decision-making process. One of the most important subjects regarding these systems is increasing system accuracy, meaning how close the recommendations are to the user int...

Publication date: 2012